Value of clinical tests in diagnosing anterior cruciate ligament injuries: A systematic review and meta-analysis

Objectives: This study compared 4 clinical tests with reference to magnetic resonance imaging and arthroscopic visualization to comprehensively evaluate their diagnostic value for anterior cruciate ligament injuries.

Methods: We systematically searched 10 electronic databases from January 1, 2010, to May 1, 2021. Two reviewers collected data in accordance with the Preferred Reporting Items for Systematic Reviews and Meta-Analyses 2020 guidelines. The quality of each study was assessed using the Quality Assessment of Diagnostic Accuracy Studies 2 tool. A meta-analysis was performed using Meta-Disc version 1.4 and Stata SE version 15.0.

Results: Eighteen articles involving 2031 participants were included. The results of the meta-analysis showed that for the Lachman test, the pooled sensitivity, specificity, positive likelihood ratio, negative likelihood ratio, diagnostic odds ratio, area under the curve (AUC) of the summary receiver operating characteristic (SROC), and Q* were 0.76 (95% CI, 0.73–0.78), 0.89 (95% CI, 0.87–0.91), 5.65 (95% CI, 4.05–7.86), 0.28 (95% CI, 0.23–0.36), 22.95 (95% CI, 14.34–36.72), 0.88, and 0.81, respectively. For the anterior drawer test, the pooled sensitivity, specificity, positive likelihood ratio, negative likelihood ratio, diagnostic odds ratio, AUC of the SROC, and Q* were 0.64 (95% CI, 0.61–0.68), 0.87 (95% CI, 0.84–0.90), 3.57 (95% CI, 2.13–5.96), 0.44 (95% CI, 0.32–0.59), 8.77 (95% CI, 4.11–18.74), 0.85, and 0.78, respectively. For the pivot shift test, the pooled sensitivity, specificity, positive likelihood ratio, negative likelihood ratio, diagnostic odds ratio, AUC of the SROC, and Q* were 0.59 (95% CI, 0.56–0.62), 0.97 (95% CI, 0.95–0.98), 13.99 (95% CI, 9.96–19.64), 0.44 (95% CI, 0.35–0.55), 29.46 (95% CI, 15.60–55.67), 0.98, and 0.94, respectively. For the lever sign test, the pooled sensitivity, specificity, positive likelihood ratio, negative likelihood ratio, diagnostic odds ratio, AUC of the SROC, and Q* were 0.79 (95% CI, 0.75–0.83), 0.92 (95% CI, 0.87–0.95), 9.56 (95% CI, 2.76–33.17), 0.23 (95% CI, 0.12–0.46), 47.38 (95% CI, 8.68–258.70), 0.94, and 0.87, respectively.

Conclusions: Existing evidence shows that these clinical tests have high diagnostic efficacy for anterior cruciate ligament injuries, and that every test has its own advantages and disadvantages. However, these results should be validated through additional studies, considering the limited quality and quantity of the included studies.


Introduction
Anterior cruciate ligament (ACL) injuries are common athletic injuries that may compromise the stability of the knee joint. They are usually caused by direct or indirect trauma to the knee joint. As economic development raises awareness of physical exercise, the number of patients with ACL injuries is also increasing. In the United States, approximately 35,000 procedures are performed annually for ACL reconstruction. [1] Globally, the number of these procedures exceeds one million per year. [2] Accurate early diagnosis of ACL injuries is important for selecting an appropriate treatment regimen and improving prognosis. Currently, clinical tests (the Lachman test [LT], anterior drawer test [ADT], pivot shift test [PST], and lever sign test [LST]) and imaging procedures, such as radiography, ultrasound, computed tomography, magnetic resonance imaging (MRI), and arthroscopic visualization, are the most common methods for diagnosing ACL injuries. Radiography, ultrasound, and computed tomography cannot precisely locate or measure the injury, and they place relatively high demands on the operator's experience. MRI is expensive and subject to measurement errors. Arthroscopic visualization is the reference standard for the diagnosis of ACL injuries, but it is invasive. Therefore, clinical testing is crucial. This study collected articles published worldwide, in English or Chinese, on the 4 clinical tests (LT, ADT, PST, and LST) for the diagnosis of ACL injuries. A meta-analysis was conducted to evaluate the diagnostic value of these tests and to provide a reference for the early clinical diagnosis of ACL injuries.

Methods
This systematic review was conducted in accordance with the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) 2020 guidelines [3] and was prospectively registered in the International Prospective Register of Systematic Reviews (PROSPERO) (registration number: CRD42021256253).
The PubMed, Cochrane Library, Embase, Web of Science, CNKI, Wanfang Data, VIP, CBM, Chinese Clinical Trial Registry, and ClinicalTrials.gov electronic databases were searched for relevant studies published from January 1, 2010, to May 1, 2021. In addition, the reference lists of the included studies were reviewed to identify further relevant data.

Inclusion and exclusion criteria
Inclusion criteria.
We included articles on clinical tests for the diagnosis of ACL injuries, published in English or Chinese up to the search date, in which all participants also underwent a reference standard test and the diagnostic results were explicitly reported. There were no limitations regarding the sex or age of the participants.

Exclusion criteria.
For duplicate publications, only the latest and most complete data were included. Additionally, the following articles were excluded: fundamental studies such as animal experiments, systematic reviews, conference papers, abstracts, lectures, and case reports; studies with unclear measurements, inappropriate statistical methods, or insufficiently described important outcome indicators; and articles from which the results could not be extracted directly or indirectly.

Data collection
All of the retrieved articles were imported into NoteExpress version 3.4 to identify duplicate articles automatically. The remaining articles were first screened by reading the abstracts. The full texts of relevant articles were then downloaded, and articles that met the abovementioned inclusion and exclusion criteria were selected. Article screening, data extraction, and cross-checks were independently conducted by 2 reviewers. Differences, if any, were resolved through discussion or negotiation with a third reviewer. The following information was extracted from each of the studies: title; first author; journal of publication; baseline characteristics and diagnostic information of the participants; key elements of risk of bias assessment; and outcome indicators, including the values of true positive, false positive, false negative, and true negative, which were calculated or acquired directly.
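The four extracted counts determine every per-study accuracy measure that is pooled later. The following Python sketch is illustrative only and is not the extraction procedure used in this review: the class `TwoByTwo`, the helper `from_reported_rates`, and the example numbers are hypothetical. It shows how counts can be back-calculated when an article reports only sensitivity, specificity, and group sizes, and how the measures follow from the counts.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TwoByTwo:
    """Counts of one index test against the reference standard (hypothetical helper)."""
    tp: int  # index test positive, ACL injury confirmed
    fp: int  # index test positive, ACL intact
    fn: int  # index test negative, ACL injury confirmed
    tn: int  # index test negative, ACL intact

    @property
    def sensitivity(self) -> float:
        return self.tp / (self.tp + self.fn)

    @property
    def specificity(self) -> float:
        return self.tn / (self.tn + self.fp)

    @property
    def positive_lr(self) -> float:
        # True-positive rate divided by false-positive rate.
        return self.sensitivity / (1 - self.specificity)

    @property
    def negative_lr(self) -> float:
        # False-negative rate divided by true-negative rate.
        return (1 - self.sensitivity) / self.specificity

    @property
    def dor(self) -> float:
        # Diagnostic odds ratio; undefined when fp or fn is zero.
        return (self.tp * self.tn) / (self.fp * self.fn)


def from_reported_rates(sens: float, spec: float,
                        n_injured: int, n_intact: int) -> TwoByTwo:
    """Back-calculate counts when an article reports only rates and group sizes."""
    tp = round(sens * n_injured)
    tn = round(spec * n_intact)
    return TwoByTwo(tp=tp, fp=n_intact - tn, fn=n_injured - tp, tn=tn)


# Hypothetical study: 80 injured and 40 intact knees, sensitivity 0.75, specificity 0.90.
table = from_reported_rates(0.75, 0.90, n_injured=80, n_intact=40)
print(table, f"+LR={table.positive_lr:.2f}", f"-LR={table.negative_lr:.2f}",
      f"DOR={table.dor:.1f}")
```

Rounding during back-calculation can shift a count by one, so counts recovered this way are best cross-checked against any totals the article reports.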

Quality assessment
The risk of bias for the included studies was assessed using Review Manager version 5.3, based on the Quality Assessment of Diagnostic Accuracy Studies 2 (QUADAS-2) tool. [4] The results were cross-checked independently by the 2 reviewers. Differences, if any, were resolved through discussion. Each item was rated as "yes" (low bias or good suitability), "no" (high bias or poor suitability), or "unclear" (lack of relevant information or uncertainty regarding the bias).

Data analysis
Meta-Disc version 1.4 and Stata SE version 15.0 were used for the meta-analysis. The presence of a threshold effect was tested using Spearman correlation analysis. A significant positive correlation between sensitivity and (1 − specificity) indicated the presence of a threshold effect. Statistical heterogeneity among the studies was analyzed using the chi-squared test, and the magnitude of heterogeneity was determined based on I² values. In the case of statistical heterogeneity among the studies (I² > 50%), a random-effects model was used for the pooled analysis after excluding significant clinical heterogeneity through meta-regression or subgroup analysis; otherwise, a fixed-effects model was used for the pooled analysis. Based on the corresponding model, the pooled sensitivity (Sen), specificity (Spe), positive likelihood ratio (+LR), negative likelihood ratio (−LR), and diagnostic odds ratio (DOR) of the included studies were calculated. The summary receiver operating characteristic (SROC) curve was then plotted, and the area under the curve (AUC) and Q* were calculated. For the sensitivity analysis, the included studies were excluded one at a time and the remaining studies were pooled again. If the pooled results did not change substantially after any single exclusion, the results were considered stable; otherwise, they were considered unstable. Finally, Deeks' funnel plot was used to examine publication bias, with P > .05 indicating no significant publication bias among the included studies and P ≤ .05 indicating publication bias.
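The two screening statistics described above can be illustrated with a short Python sketch. It is a simplified illustration under stated assumptions, not the Meta-Disc or Stata implementation: the function names are hypothetical, the study counts are invented, and I² is computed here for ln(DOR) with inverse-variance weights as one possible choice of effect measure.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman coefficient, computed as the Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = math.sqrt(sum((a - mx) ** 2 for a in rx))
    sy = math.sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)


def i_squared(effects, variances):
    """I^2 in percent, derived from Cochran's Q with inverse-variance weights."""
    weights = [1 / v for v in variances]
    pooled = sum(w * e for w, e in zip(weights, effects)) / sum(weights)
    q = sum(w * (e - pooled) ** 2 for w, e in zip(weights, effects))
    df = len(effects) - 1
    return max(0.0, (q - df) / q) * 100 if q > 0 else 0.0


# Hypothetical (tp, fp, fn, tn) counts, not taken from the included studies.
studies = [(45, 5, 15, 35), (30, 2, 20, 48), (60, 10, 10, 40), (25, 1, 25, 49)]
sens = [tp / (tp + fn) for tp, fp, fn, tn in studies]
fpr = [fp / (fp + tn) for tp, fp, fn, tn in studies]  # 1 - specificity
rho = spearman_rho(sens, fpr)

log_dor = [math.log(tp * tn / (fp * fn)) for tp, fp, fn, tn in studies]
var_log_dor = [1 / tp + 1 / fp + 1 / fn + 1 / tn for tp, fp, fn, tn in studies]
print(f"Spearman rho = {rho:.2f}, I^2 = {i_squared(log_dor, var_log_dor):.1f}%")
```

In this invented data set, sensitivity and (1 − specificity) rise together across studies, which is the pattern a threshold effect would produce. A significance test for the coefficient is omitted from the sketch.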

Risk of bias assessment in the included studies
The QUADAS-2 evaluation of the included articles showed that the risk of bias in the 4 domains (patient selection, index test, reference standard, and flow and timing) was relatively unsatisfactory. The risk of bias was relatively high, especially for the index test and for flow and timing. For the index test, 14 articles either did not use a threshold or did not prespecify it. For flow and timing, 9 articles did not include all cases in the relevant analysis. This introduced bias into the articles included in this study (Figs. 2 and 3; Supplemental Table 1, http://links.lww.com/MD/G869).

Sensitivity analysis.
A sensitivity analysis was conducted by excluding each study in turn and pooling the remaining studies. The results showed that the effect of each excluded study on the pooled effect size was relatively small, indicating that the results of this study were robust and that the confidence level of the analysis results was high (Fig. 5).
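The leave-one-out procedure can be sketched as follows. This is a minimal illustration under stated assumptions rather than the analysis run for this review: it pools ln(DOR) with a DerSimonian-Laird random-effects estimator, the function names are hypothetical, and the counts are invented.

```python
import math


def log_dor(tp, fp, fn, tn):
    """ln(DOR) and its variance, with a 0.5 correction if any cell is zero."""
    if 0 in (tp, fp, fn, tn):
        tp, fp, fn, tn = (c + 0.5 for c in (tp, fp, fn, tn))
    return math.log(tp * tn / (fp * fn)), 1 / tp + 1 / fp + 1 / fn + 1 / tn


def pool_random_effects(tables):
    """DerSimonian-Laird pooled DOR for a list of (tp, fp, fn, tn) tuples."""
    effects, variances = zip(*(log_dor(*t) for t in tables))
    w = [1 / v for v in variances]
    fixed = sum(wi * e for wi, e in zip(w, effects)) / sum(w)
    q = sum(wi * (e - fixed) ** 2 for wi, e in zip(w, effects))
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    # Between-study variance, truncated at zero.
    tau2 = max(0.0, (q - (len(tables) - 1)) / c) if c > 0 else 0.0
    w_star = [1 / (v + tau2) for v in variances]
    pooled = sum(wi * e for wi, e in zip(w_star, effects)) / sum(w_star)
    return math.exp(pooled)


def leave_one_out(tables):
    """Pooled DOR recomputed with each study omitted in turn."""
    return [pool_random_effects(tables[:i] + tables[i + 1:])
            for i in range(len(tables))]


# Hypothetical (tp, fp, fn, tn) counts, not taken from the included studies.
tables = [(45, 5, 15, 35), (30, 2, 20, 48), (60, 10, 10, 40), (25, 1, 25, 49)]
print(f"All studies: pooled DOR = {pool_random_effects(tables):.2f}")
for i, dor in enumerate(leave_one_out(tables), start=1):
    print(f"Omitting study {i}: pooled DOR = {dor:.2f}")
```

If the pooled estimate stays close to the all-studies value whichever study is omitted, no single study is driving the result.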

Analysis of the publication bias.
For the included studies on LT for the diagnosis of ACL injuries, Deeks' funnel plot was drawn, with the inverse of the square root of the effective sample size (1/√ESS) on the vertical axis and the DOR on the horizontal axis. [23] The result showed no evidence of publication bias for LT (P = .83) (Fig. 6).

The cause of heterogeneity was not found through meta-regression or subgroup analysis; therefore, the effect sizes were pooled using a random-effects model (Fig. 7).

Sensitivity analysis.
A sensitivity analysis was conducted by excluding each study in turn and pooling the remaining studies. The results showed that the effect of each excluded study on the pooled effect size was relatively small, indicating that the results of this study were robust and that the confidence level of the analysis results was high (Fig. 8).

Analysis of the publication bias.
For the included studies on ADT for the diagnosis of ACL injuries, Deeks' funnel plot was drawn, with the inverse of the square root of the effective sample size (1/√ESS) on the vertical axis and the DOR on the horizontal axis. [23] The result showed no evidence of publication bias for ADT (P = .20) (Fig. 9).
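The asymmetry test behind Deeks' funnel plot regresses ln(DOR) on 1/√ESS, weighting each study by its effective sample size, ESS = 4·n₁·n₂/(n₁ + n₂), where n₁ and n₂ are the numbers of injured and intact knees. The Python sketch below is a simplified illustration, not the Stata routine used for this review: the function names are hypothetical, the counts are invented, and only the slope and its t statistic are returned. The P value would come from a t distribution with (number of studies − 2) degrees of freedom, which is left out to keep the sketch dependency-free.

```python
import math


def effective_sample_size(n_injured, n_intact):
    """ESS = 4 * n1 * n2 / (n1 + n2)."""
    return 4 * n_injured * n_intact / (n_injured + n_intact)


def deeks_slope(tables):
    """Weighted least-squares slope of ln(DOR) on 1/sqrt(ESS), and its t value.

    tables: at least 3 (tp, fp, fn, tn) tuples with all cells non-zero.
    """
    xs, ys, ws = [], [], []
    for tp, fp, fn, tn in tables:
        ess = effective_sample_size(tp + fn, fp + tn)
        xs.append(1 / math.sqrt(ess))
        ys.append(math.log(tp * tn / (fp * fn)))
        ws.append(ess)  # each study is weighted by its ESS
    sw = sum(ws)
    x_bar = sum(w * x for w, x in zip(ws, xs)) / sw
    y_bar = sum(w * y for w, y in zip(ws, ys)) / sw
    sxx = sum(w * (x - x_bar) ** 2 for w, x in zip(ws, xs))
    sxy = sum(w * (x - x_bar) * (y - y_bar) for w, x, y in zip(ws, xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    rss = sum(w * (y - intercept - slope * x) ** 2
              for w, x, y in zip(ws, xs, ys))
    se = math.sqrt(rss / (len(tables) - 2) / sxx)
    return slope, slope / se


# Hypothetical counts in which smaller studies report larger DORs.
tables = [(7, 1, 1, 7), (40, 10, 10, 40), (150, 50, 50, 150)]
slope, t_value = deeks_slope(tables)
print(f"slope = {slope:.2f}, t = {t_value:.2f}")
```

A slope that differs significantly from zero indicates funnel plot asymmetry. In the invented data the slope is positive because the smallest study has the largest DOR.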

Sensitivity analysis.
A sensitivity analysis was conducted by excluding each study in turn and pooling the remaining studies. The results showed that the effect of each excluded study on the pooled effect size was relatively small, indicating that the results of this study were robust and that the confidence level of the analysis results was high (Fig. 11).

Analysis of the publication bias.
For the included studies on PST for the diagnosis of ACL injuries, Deeks' funnel plot was drawn, with the inverse of the square root of the effective sample size (1/√ESS) on the vertical axis and the DOR on the horizontal axis. [23] The result showed no evidence of publication bias for PST (P = .29) (Fig. 12).

Meta-analysis of LST
Six articles, with 7 studies on 590 participants, were included in the meta-analysis (Fig. 13). The cause of heterogeneity was not found through meta-regression or subgroup analysis; therefore, the effect sizes were pooled using a random-effects model.

Sensitivity analysis.
A sensitivity analysis was conducted by excluding each study in turn and pooling the remaining studies. The results showed that the effect of each excluded study on the pooled effect size was relatively small, indicating that the results of this study were robust and that the confidence level of the analysis results was high (Fig. 14).

Analysis of the publication bias.
For the included studies on LST for the diagnosis of ACL injuries, Deeks' funnel plot was drawn, with the inverse of the square root of the effective sample size (1/√ESS) on the vertical axis and the DOR on the horizontal axis. [23] The result showed no evidence of publication bias for LST (P = .77) (Fig. 15).

Discussion
A meta-analysis critically reviews and statistically combines the results of previous independent studies on the same question. By comprehensively evaluating inconsistent or contradictory results, it can enlarge the sample size, increase statistical power, and identify the shortcomings of previous studies, thereby revealing the uncertainties of individual studies and suggesting new topics for future research.

This study encompassed 18 articles, with arthroscopy, surgical exploration, and MRI as the reference standards for clinical tests in diagnosing ACL injuries. Our meta-analysis showed that the pooled sensitivities of LT, ADT, PST, and LST for diagnosing ACL injuries were 0.76, 0.64, 0.59, and 0.79, respectively, whereas the pooled specificities were 0.89, 0.87, 0.97, and 0.92, respectively. This suggests that the capability of the 4 clinical tests to diagnose ACL injuries was high. Likelihood ratios (LRs) are defined as the likelihood that a particular test result would be found in a patient with the target disorder, relative to the likelihood of the same test result occurring in a patient without the target disorder. The positive likelihood ratio, that is, the ratio of the true-positive rate to the false-positive rate, sensitivity/(1 − specificity), compares the probability of a positive result in a patient with the disorder with the probability of a positive result in a patient without it. A higher +LR indicates a greater probability that a positive test result is a true positive result.
The negative likelihood ratio, that is, the ratio of the false-negative rate to the true-negative rate, (1 − sensitivity)/specificity, compares the probability of a negative result in a patient with the disorder with the probability of a negative result in a patient without it. A lower −LR indicates a greater probability that a negative test result is a true negative result. [24] The pooled +LRs of LT, ADT, PST, and LST for diagnosing ACL injuries were 5.65, 3.57, 13.99, and 9.56, respectively, suggesting that when LT, PST, or LST is positive, the probability of an ACL injury is high. The pooled −LRs of LT, ADT, PST, and LST for diagnosing ACL injuries were 0.28, 0.44, 0.44, and 0.23, respectively, indicating that ACL injuries are likely to be excluded when the 4 clinical tests are negative. The DOR is the ratio of the odds of disease in positive tests relative to the odds of disease in negative tests. The value of DOR ranges from 0 to infinity, with higher values indicating better discriminatory test performance. A value of 1 means that a test does not discriminate between patients with and without the disorder. Values lower than 1 indicate improper test interpretation (more negative tests among the diseased). [25] The DORs of LT, ADT, PST, and LST for the diagnosis of ACL injuries were 22.95, 8.77, 29.46, and 47.38, respectively, suggesting that the accuracy of the 4 clinical tests for the diagnosis of ACL injuries is high. The SROC curve considers both sensitivity and specificity, and the AUC of the SROC allows several clinical tests to be compared comprehensively. The larger the AUC, the better the overall diagnostic performance.
[26] The AUCs of LT, ADT, PST, and LST for diagnosing ACL injuries were 0.88, 0.85, 0.98, and 0.94, respectively; the Q* values of LT, ADT, PST, and LST for diagnosing ACL injuries were 0.81, 0.78, 0.94, and 0.87, respectively, which indicates that the 4 clinical tests have a high diagnostic efficiency for ACL injuries.

ADT is an important method for the clinical diagnosis of ACL injuries. However, despite its wide use in clinical practice, this method has some limitations. On one hand, patients with acute injuries often cannot cooperate effectively, owing to intra-articular hematoma and local pain in the affected limb; moreover, the knee joint cannot maintain flexion at 90°. On the other hand, when the knee joint is flexed at 90°, the posterior horn of the meniscus attached to the medial tibia abuts the convex surface of the medial femoral condyle, inducing a "door stopper" effect that prevents the tibia from moving forward and results in a false-negative result. [27] Furthermore, when the posterior cruciate ligament relaxes or ruptures, the tibia, having sagged posteriorly, may move forward simply by returning to its neutral starting position, which may cause misdiagnosis. [12] Ostrowski et al [28] reported that the overall sensitivity of ADT was only 20% (range, 18%–92%), while the specificity was 88% (range, 78%–98%). Benjaminse et al [29] reported that ADT could yield good results in chronic patients, with a sensitivity of 0.92 (95% CI, 0.88–0.95) and specificity of 0.91 (95% CI, 0.87–0.94).

LT can be regarded as an ADT performed at 15° of knee flexion. ACL injuries are assessed mainly by observing the degree of anteroposterior movement of the tibia relative to the femur at 15° of knee flexion. [12] Therefore, LT can be used to examine patients with acute joint swelling, pain, and inability to flex the knee to 90°.
When the knee joint is flexed at 15°, the relatively flat articular surface of the femur no longer blocks the forward movement of the meniscus and tibia, thereby overcoming the disadvantages of ADT. [30] However, when the posterior cruciate ligament relaxes or ruptures, misdiagnosis is also possible with LT. [31] Rosenberg et al [32] investigated the effect of clinical examinations on ACL tension and found that LT with the knee bent at 15° could produce the maximum tension in most ACL areas, while ADT with the knee bent at 90° could not produce the maximum tension in any part of the ACL.

The principle of PST is based on imitating the mechanism of ACL injuries. Therefore, PST is often affected by the patient's muscle tension, the protective response induced by pain, and the range of motion, which significantly compromises the accuracy of the examination. However, under anesthesia, PST is relatively reliable. [12]

LST partially utilizes the lever principle first proposed by the ancient Greek scientist Archimedes in his "On the Equilibrium of Planes"; volume 1 of the work contains "the law of the lever," which states that to balance the lever, the 2 torques acting on the lever (each the product of a force and its moment arm) must be equal, that is, effort × effort arm = load × load arm. In this process, the lever passing through the fulcrum transmits the force. One prerequisite for the principle and formula is the integrity of the lever. After ACL rupture, the downward pressure exerted on the thigh cannot move the weight of the leg and foot through the lever formed by the knee joint and the calf, because the continuous transmission of the lever force has been interrupted; at this point, the lever test is positive. The lever test can overcome the disadvantages of the above 3 tests: it does not require much examiner experience, the procedure is simple, and the patient's pain is not increased. Lelli et al reported that the sensitivity of LST in the diagnosis of chronic complete ruptures of the ACL was close to 100%, and that its sensitivity, especially in acute and partial ruptures of the ACL, was significantly higher than that of the other clinical tests. [33]

Figure 15. Funnel plot of LST for the diagnosis of ACL injuries. ACL = anterior cruciate ligament, LST = lever sign test.

Limitations of this study
All of the articles included in this study were written in English or Chinese; therefore, a degree of language-related selection bias is possible. Differences in the number of included patients, as well as their sex, age, and degree of ACL injury, may have caused high heterogeneity. The diagnostic criteria and reference standards also differed between studies. Differences in the time interval between the clinical tests and the reference standard tests may have introduced bias and affected the results of this study. Meta-regression did not identify the source of the heterogeneity, which suggests that the accuracy of the 4 clinical tests in the diagnosis of ACL injuries may depend substantially on the skill and experience of the operators and the severity of the injuries.

Conclusion
In summary, the 4 clinical tests have a certain value in the diagnosis of ACL injuries. Moreover, each clinical test has both strengths and limitations. In clinical practice, the 4 clinical tests can be integrated to improve diagnostic performance. Considering the limitations in the number and quality of the included studies, relevant conclusions still need to be verified through more high-quality studies.